86 research outputs found

    Applying test case prioritization to software microbenchmarks

    Get PDF
    Regression testing comprises techniques which are applied during software evolution to uncover faults effectively and efficiently. While regression testing is widely studied for functional tests, performance regression testing, e.g., with software microbenchmarks, is hardly investigated. Applying test case prioritization (TCP), a regression testing technique, to software microbenchmarks may help capture large performance regressions sooner upon new versions. This may be especially beneficial for microbenchmark suites, because they take considerably longer to execute than unit test suites. However, it is unclear whether traditional unit testing TCP techniques work equally well for software microbenchmarks. In this paper, we empirically study coverage-based TCP techniques, employing total and additional greedy strategies, applied to software microbenchmarks along multiple parameterization dimensions, leading to 54 unique technique instantiations. We find that TCP techniques have a mean APFD-P (average percentage of fault-detection on performance) effectiveness between 0.54 and 0.71 and are able to capture the three largest performance changes after executing 29% to 66% of the whole microbenchmark suite. Our efficiency analysis reveals that the runtime overhead of TCP varies considerably depending on the exact parameterization. The most effective technique has an overhead of 11% of the total microbenchmark suite execution time, making TCP a viable option for performance regression testing. The results demonstrate that the total strategy is superior to the additional strategy. Finally, dynamic-coverage techniques should be favored over static-coverage techniques due to their acceptable analysis overhead; however, in settings where the time for prioritization is limited, static-coverage techniques provide an attractive alternative.
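
    To make the total and additional greedy strategies concrete, here is a minimal sketch in Python; the benchmark names, covered-method sets, and coverage granularity are hypothetical, and the paper's techniques additionally vary along further parameterization dimensions.

        # Minimal sketch of the two greedy strategies used in coverage-based TCP.
        # The coverage data below is hypothetical; in the study it comes from
        # dynamic or static coverage of software microbenchmarks.

        def total_greedy(coverage):
            """Order benchmarks by total number of covered elements (descending)."""
            return sorted(coverage, key=lambda b: len(coverage[b]), reverse=True)

        def additional_greedy(coverage):
            """Repeatedly pick the benchmark covering the most not-yet-covered elements."""
            remaining = dict(coverage)
            covered, order = set(), []
            while remaining:
                best = max(remaining, key=lambda b: len(remaining[b] - covered))
                order.append(best)
                covered |= remaining.pop(best)
            return order

        coverage = {
            "benchParse":  {"Parser.parse", "Lexer.next", "Token.of"},
            "benchFormat": {"Formatter.fmt", "Token.of"},
            "benchLex":    {"Lexer.next"},
        }
        print(total_greedy(coverage))       # ['benchParse', 'benchFormat', 'benchLex']
        print(additional_greedy(coverage))  # ['benchParse', 'benchFormat', 'benchLex']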

    On-the-Fly Syntax Highlighting using Neural Networks

    Full text link
    With the presence of online collaborative tools for software developers, source code is shared and consulted frequently, from code viewers to merge requests and code snippets. Typically, code highlighting quality in such scenarios is sacrificed in favor of system responsiveness. In these on-the-fly settings, performing a formal grammatical analysis of the source code is not only expensive, but also intractable in the many cases where the input is an invalid derivation of the language. Indeed, current popular highlighters rely heavily on a system of regular expressions, typically far from the specification of the language's lexer. Due to their complexity, regular expressions need to be updated periodically as more user feedback is collected, and their design hinders the detection of more complex language constructs. This paper delivers a deep learning-based approach suitable for on-the-fly grammatical code highlighting of both correct and incorrect language derivations, as found in code viewers and snippets. It alleviates the burden on developers, who can reuse the language's parsing strategy to produce the desired highlighting specification. Moreover, the approach is compared to current online syntax highlighting tools and formal methods in terms of accuracy and execution time, across different levels of grammatical coverage, for three mainstream programming languages. The results show that the proposed approach consistently achieves near-perfect accuracy in its predictions, thereby outperforming regular expression-based strategies. (Accepted for publication at the ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/FSE 2022.)
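
    As an illustration only, the following hedged sketch frames on-the-fly highlighting as per-token sequence labelling in PyTorch; the architecture, vocabulary size, and number of highlighting classes are assumptions rather than the model evaluated in the paper.

        # Hypothetical sketch: per-token highlighting as sequence labelling.
        import torch
        import torch.nn as nn

        class TokenHighlighter(nn.Module):
            def __init__(self, vocab_size, num_classes, emb_dim=64, hidden=128):
                super().__init__()
                self.emb = nn.Embedding(vocab_size, emb_dim)
                self.rnn = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
                self.out = nn.Linear(2 * hidden, num_classes)

            def forward(self, token_ids):             # (batch, seq_len)
                h, _ = self.rnn(self.emb(token_ids))  # (batch, seq_len, 2*hidden)
                return self.out(h)                    # per-token class logits

        model = TokenHighlighter(vocab_size=5000, num_classes=8)
        logits = model(torch.randint(0, 5000, (1, 12)))
        highlight_classes = logits.argmax(dim=-1)     # one highlighting class per token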

    Automated Reporting of Anti-Patterns and Decay in Continuous Integration

    Full text link
    Continuous Integration (CI) is a widely used software engineering practice. The software is continuously built so that changes can be easily integrated and issues such as unmet quality goals or style inconsistencies are detected early. Unfortunately, it is not only hard to introduce CI into an existing project, but it is also challenging to live up to the CI principles when facing tough deadlines or business decisions. Previous work has identified common anti-patterns that reduce the promised benefits of CI. Typically, these anti-patterns slowly creep into a project over time before they are identified. We argue that automated detection can help with early identification and prevent such process decay. In this work, we further analyze this assumption and survey 124 developers about CI anti-patterns. From the results, we build CI-Odor, a reporting tool for CI processes that detects the existence of four relevant anti-patterns by analyzing regular build logs and repository information. In a study of 18,474 build logs from 36 popular Java projects, we reveal 3,823 high-severity warnings spread across the projects. We validate our reports in a survey among 13 original developers of these projects and through general feedback from 42 developers, both of which confirm the relevance of our reports.
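
    As an illustration of rule-based detection over build metadata, the sketch below flags one hypothetical anti-pattern (builds left broken for too long); CI-Odor's actual rules, thresholds, and inputs are richer and survey-derived.

        # Illustrative sketch: flag periods where the build stayed red longer
        # than a threshold. Data layout and threshold are assumptions.
        from datetime import datetime, timedelta

        def long_broken_periods(builds, threshold=timedelta(days=2)):
            """builds: list of (timestamp, passed) tuples ordered by time."""
            warnings, broken_since = [], None
            for ts, passed in builds:
                if not passed and broken_since is None:
                    broken_since = ts
                elif passed and broken_since is not None:
                    if ts - broken_since > threshold:
                        warnings.append((broken_since, ts))
                    broken_since = None
            return warnings

        builds = [
            (datetime(2024, 1, 1), True),
            (datetime(2024, 1, 2), False),
            (datetime(2024, 1, 6), True),   # red for 4 days -> warning
        ]
        print(long_broken_periods(builds))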

    A large-scale empirical exploration on refactoring activities in open source software projects

    Get PDF
    Refactoring is a well-established practice that aims at improving the internal structure of a software system without changing its external behavior. Existing literature provides evidence of how and why developers perform refactoring in practice. In this paper, we continue this line of research by performing a large-scale empirical analysis of refactoring practices in 200 open source systems. Specifically, we analyze the change history of these systems at commit level to investigate: (i) whether developers perform refactoring operations and, if so, which are the most diffused; (ii) when refactoring operations are applied; and (iii) which are the main developer-oriented factors leading to refactoring. Based on our results, future research can focus on enabling automatic support for less frequent refactorings and on recommending refactorings based on the developer's workload, the project's maturity, and the developer's commitment to the project.
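
    A minimal sketch of such a commit-level mining loop is shown below; detect_refactorings is a placeholder for a real refactoring detection tool, and the data layout is an assumption.

        # Hypothetical sketch: walk a project's commit history and tally the
        # refactoring types detected between consecutive snapshots.
        from collections import Counter

        def detect_refactorings(old_snapshot, new_snapshot):
            """Placeholder: return refactoring type names found between two snapshots."""
            return []  # e.g. ["Extract Method", "Rename Class"]

        def mine_refactorings(commits):
            """commits: iterable of (old_snapshot, new_snapshot) pairs, one per commit."""
            counts = Counter()
            for old_snapshot, new_snapshot in commits:
                counts.update(detect_refactorings(old_snapshot, new_snapshot))
            return counts  # frequency of each refactoring type across the change history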

    Exploiting natural language structures in software informal documentation

    Get PDF
    Communication means, such as issue trackers, mailing lists, Q&A forums, and app reviews, are premier means of collaboration among developers, and between developers and end-users. Analyzing such sources of information is crucial to build recommenders for developers, for example suggesting experts, re-documenting source code, or transforming user feedback into maintenance and evolution strategies for developers. To ease this analysis, in previous work we proposed DECA (Development Emails Content Analyzer), a tool based on Natural Language Parsing that classifies development email fragments according to their purpose with high precision. However, DECA has to be trained through a manual tagging of relevant patterns, which is often effort-intensive, error-prone, and requires specific expertise in natural language parsing. In this paper, we first show, with a study involving Master's and Ph.D. students, the extent to which producing rules for identifying such patterns requires effort, depending on the nature and complexity of the patterns. Then, we propose an approach, named NEON (Nlp-based softwarE dOcumentation aNalyzer), that automatically mines such rules, minimizing the manual effort. We assess the performance of NEON in the analysis and classification of mobile app reviews, developer discussions, and issues. NEON simplifies pattern identification and rule definition, saving more than 70% of the time otherwise spent performing such activities manually. Results also show that NEON-generated rules are close to the manually identified ones, achieving comparable recall.
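
    NEON mines rules over natural-language parse structures; purely to illustrate the idea of rule-based fragment classification, the sketch below swaps those structures for plain regular expressions, with hypothetical labels and patterns.

        # Illustrative sketch only: NEON's rules operate on parse trees, not on
        # regular expressions. Labels and patterns here are made up.
        import re

        RULES = {
            "feature_request": [
                re.compile(r"\b(please|could you) (add|support)\b", re.I),
                re.compile(r"\bit would be (nice|great|useful) if\b", re.I),
            ],
        }

        def classify_fragment(sentence):
            """Return the intention labels whose rules match the sentence."""
            return [label for label, patterns in RULES.items()
                    if any(p.search(sentence) for p in patterns)]

        print(classify_fragment("It would be great if the app supported dark mode."))
        # ['feature_request']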

    Branch coverage prediction in automated testing

    Get PDF
    Software testing is crucial in continuous integration (CI). Ideally, at every commit, all the test cases should be executed and, moreover, new test cases should be generated for the new source code. This is especially true in a Continuous Test Generation (CTG) environment, where the automatic generation of test cases is integrated into the continuous integration pipeline. In this context, developers want to achieve a certain minimum level of coverage for every software build. However, executing all the test cases and, moreover, generating new ones for all the classes at every commit is not feasible. As a consequence, developers have to select which subset of classes has to be tested and/or targeted by test-case generation. We argue that knowing a priori the branch coverage that can be achieved with test-data generation tools can help developers take informed decisions about these issues. In this paper, we investigate the possibility of using source-code metrics to predict the coverage achieved by test-data generation tools. We use four different categories of source-code features and assess the prediction on a large data set involving more than 3,000 Java classes. We compare different machine learning algorithms and conduct a fine-grained feature analysis aimed at investigating the factors that most impact the prediction accuracy. Moreover, we extend our investigation to four different search budgets. Our evaluation shows that the best model achieves an average MAE of 0.15 and 0.21 on nested cross-validation over the different budgets for EVOSUITE and RANDOOP, respectively. Finally, the discussion of the results demonstrates the relevance of coupling-related features for the prediction accuracy.
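
    A minimal sketch of the prediction setup with scikit-learn is shown below; the feature set and synthetic data are assumptions, and the paper compares several learners, four feature categories, and four search budgets.

        # Sketch: predict achievable branch coverage from source-code metrics.
        import numpy as np
        from sklearn.ensemble import RandomForestRegressor
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(0)
        # Hypothetical per-class features, e.g. LOC, branches, coupling, nesting depth.
        X = rng.random((300, 4))
        # Hypothetical achieved coverage in [0, 1] for a fixed search budget.
        y = np.clip(1.0 - 0.6 * X[:, 1] - 0.3 * X[:, 2] + 0.1 * rng.random(300), 0, 1)

        model = RandomForestRegressor(n_estimators=100, random_state=0)
        mae = -cross_val_score(model, X, y, cv=5, scoring="neg_mean_absolute_error").mean()
        print(f"cross-validated MAE: {mae:.3f}")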

    Telomerecat: A ploidy-agnostic method for estimating telomere length from whole genome sequencing data.

    Get PDF
    Telomere length is a risk factor in disease, and the dynamics of telomere length are crucial to our understanding of cell replication and vitality. The proliferation of whole genome sequencing represents an unprecedented opportunity to glean new insights into telomere biology on a previously unimaginable scale. To this end, a number of approaches for estimating telomere length from whole-genome sequencing data have been proposed. Here we present Telomerecat, a novel approach to the estimation of telomere length. Previous methods have been dependent on the number of telomeres present in a cell being known, which may be problematic when analysing aneuploid cancer data and non-human samples. Telomerecat is designed to be agnostic to the number of telomeres present, making it suited to estimating telomere length in cancer studies. Telomerecat also accounts for interstitial telomeric reads and presents a novel approach to dealing with sequencing errors. We show that Telomerecat performs well at telomere length estimation when compared to leading experimental and computational methods. Furthermore, we show that it detects expected patterns in longitudinal data, repeated measurements, and cross-species comparisons. We also apply the method to cancer cell data, uncovering an interesting relationship with the underlying telomerase genotype.
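
    For intuition only, the following back-of-envelope sketch estimates mean telomere length from the fraction of telomeric reads; it assumes a fixed count of 92 telomere ends, which is precisely the assumption Telomerecat avoids, and it ignores the interstitial-read and sequencing-error corrections the tool provides.

        # Back-of-envelope estimate only, not Telomerecat's model.
        TELOMERIC_REPEAT = "TTAGGG"

        def is_telomeric(read, min_repeats=4):
            """Call a read telomeric if it contains a run of canonical repeats."""
            return TELOMERIC_REPEAT * min_repeats in read

        def naive_telomere_length(reads, genome_size=3_000_000_000, n_telomere_ends=92):
            """Mean length = telomeric read fraction * genome size / telomere ends."""
            telomeric = sum(1 for r in reads if is_telomeric(r))
            return telomeric / len(reads) * genome_size / n_telomere_ends

        # Toy reads with an exaggerated telomeric fraction; real data has millions
        # of reads of which only a tiny fraction is telomeric.
        reads = ["TTAGGG" * 25, "ACGT" * 37 + "GA", "TTAGGG" * 25, "ACGT" * 38]
        print(f"{naive_telomere_length(reads):,.0f} bp (toy data)")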

    Publisher Correction: Telomerecat: A ploidy-agnostic method for estimating telomere length from whole genome sequencing data.

    Get PDF
    A correction to this article has been published and is linked from the HTML and PDF versions of this paper. The error has been fixed in the paper.

    A Comparison of RDB-to-RDF Mapping Languages

    Full text link
    Mapping Relational Databases (RDB) to RDF is an active field of research. The majority of data on the current Web is stored in RDBs. Therefore, bridging the conceptual gap between the relational model and RDF is needed to make the data available on the Semantic Web. In addition, recent research has shown that Semantic Web technologies are useful beyond the Web, especially if data from different sources has to be exchanged or integrated. Many mapping languages and approaches have been explored, leading to the ongoing standardization effort of the World Wide Web Consortium (W3C) carried out in the RDB2RDF Working Group (WG). The goal and contribution of this paper is to provide a feature-based comparison of the state-of-the-art RDB-to-RDF mapping languages. It should act as a guide in selecting an RDB-to-RDF mapping language for a given application scenario and its requirements w.r.t. mapping features. Our comparison framework is based on use cases and requirements for mapping RDBs to RDF as identified by the RDB2RDF WG. We apply this comparison framework to the state-of-the-art RDB-to-RDF mapping languages and report the findings in this paper. As a result, our classification proposes four categories of mapping languages: direct mapping, read-only general-purpose mapping, read-write general-purpose mapping, and special-purpose mapping. We further provide recommendations for selecting a mapping language.
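
    As a small illustration of the direct-mapping category, the sketch below turns relational rows into RDF triples in N-Triples form; the table, base IRI, and literal handling are simplifying assumptions, and real mapping languages such as R2RML offer far more control.

        # Simplified direct mapping: each row becomes a subject, each column a predicate.
        def direct_mapping(table_name, primary_key, rows, base="http://example.com/"):
            """Yield N-Triples lines for a list of row dictionaries."""
            for row in rows:
                subject = f"<{base}{table_name}/{row[primary_key]}>"
                for column, value in row.items():
                    predicate = f"<{base}{table_name}#{column}>"
                    yield f'{subject} {predicate} "{value}" .'

        employees = [{"id": 7, "name": "Ada", "dept": "R&D"}]
        for triple in direct_mapping("Employee", "id", employees):
            print(triple)
        # <http://example.com/Employee/7> <http://example.com/Employee#id> "7" .
        # ... one triple per column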